Abstract

Genomic selection patterns and hybrid performance influence the chance that 
crop (trans)genes can spread to wild relatives. We measured fitness(-related) 
traits in two different field environments employing two different crop-wild 
crosses of lettuce. We performed quantitative trait loci (QTL) analyses and esti- 
mated the fitness distribution of early- and late-generation hybrids. We detected 
consistent results across field sites and crosses for a fitness QTL at linkage group 
7, where a selective advantage was conferred by the wild allele. Two fitness QTL 
were detected on linkage groups 5 and 6, each unique to one of the crop-wild
crosses. Average hybrid fitness was lower than the fitness of the wild parent,
but several hybrid lineages outperformed the wild parent, especially in a novel 
habitat for the wild type. In early-generation hybrids, this may partly be due to 
heterosis effects, whereas in late-generation hybrids transgressive segregation 
played a major role. The study of genomic selection patterns can identify crop 
genomic regions under negative selection across multiple environments and
cultivar-wild crosses that might be applicable in transgene mitigation strategies. At
the same time, results were cultivar-specific, so a case-by-case environmental
risk assessment remains necessary, which limits the general applicability of this approach.



Introduction 

The chance that crop alleles introgress into wild relatives
depends strongly on genetic and environmental
selection patterns (Barton 2001; Stewart et al. 2003). For 
crop alleles to become permanently established in the wild 
population after single hybridization events, hybrid genotypes
should confer a selective advantage in a particular
environment (Burke and Arnold 2001; Rieseberg et al. 
2007). Introgression of crop genes into a recipient population
starts with F1 hybrids, with equal contributions of crop
and wild genomes, genome-wide heterozygosity, and strong 
linkage disequilibrium (LD). In subsequent generations, a 
range of new genotypes is formed as a result of recombina- 
tion and segregation in meiosis and the creation of new 
individuals by outcrossing or selfing. However, since the 
genetic background changes rapidly in the first phases of 
the introgression process, selection patterns may differ
between early- and late-generation hybrids, as well as
among individual plants within a certain category of 
hybrids (Barton 2001). Such patterns that affect the out- 
come of hybridization are not only interesting from a theo- 
retical point of view (Rieseberg et al. 2000; Burke and 
Arnold 2001) but are also of high interest to Environmental 
Risk Assessment (ERA). Specifically, ERA needs to establish to what extent
genomic selection patterns can be generalized across different
cultivars and whether the performance of hybrids differs
between early and late generations and between different
environments (EFSA 2011).

The performance of crop-wild hybrids can differ 
depending on the cultivar and wild parental lines used to 
produce specific crosses. In experiments employing crop- 
wild hybrids from several crosses with different parental 
lines, variation was found in life history and fitness traits, 
such as germination, seed production and survival between 
different crossing populations in oilseed rape (Hauser et al. 
1998), sunflower (Mercer et al. 2006) and sorghum (Muraya
et al. 2012). These differences in fitness response
might also imply that selection acts on different regions in 
the genome. Recently, Quantitative Trait Loci (QTL) analy- 
sis on fitness characteristics measured in field trials has 
been used to study genomic selection patterns in crop-wild 
hybrids (Baack et al. 2008; Dechaine et al. 2009; Hartman 
et al. 2012), but little remains known of how differences in 
life history and fitness traits between different cultivar- 
wild-type crosses translate to differences in genomic selec- 
tion patterns. With the production of high density inte- 
grated and consensus maps it becomes possible to compare 
QTL results between different cultivar-wild-type crosses 
(Hund et al. 2011; Swamy and Sarla 2011). 

After a single hybridization event, several processes play 
a role: hitchhiking effects because of linkage drag, heterosis, 
epistasis and transgressive segregation interact to determine 
hybrid fitness (Stewart et al. 2003; Johansen-Morris and 
Latta 2006) and so influence the introgression chances of 
crop alleles. Epistasis is thought to contribute mainly to
hybrid breakdown through the disruption of co-adapted 
gene complexes (Rieseberg et al. 2000), while heterosis and 
transgressive segregation can contribute to an increase in 
the performance of some hybrid lines relative to the wild 
parent (Burke and Arnold 2001). Hence, we focus on the 
latter processes in this study (but see Uwimana et al. 
(2012b) for a study on epistasis in lettuce) and we use two 
distinct hybrid generations: early generation backcross 
(BC) lines in which heterosis and transgression effects can 
occur and Recombinant Inbred Lines (RILs) with only 
transgressive effects. 

Heterosis is most pronounced in early-generation 
hybrids, especially after hybridization between closely 
related species or inbred lines (Rieseberg et al. 2000), 
because of high levels of heterozygosity. Heterosis may be 
due to dominance (masking of deleterious alleles), over- 
dominance (single-locus heterosis) and epistasis (enhanced 
performance of traits derived from different lineages due to 
non-additive interactions of QTL) effects (Rieseberg et al. 
2000). It has been found many times in plants (Rhode and 
Cruzan 2005; Muraya et al. 2012), animals (Hedgecock 
et al. 1995) and insects (Bijlsma et al. 2010). 

Transgressive phenotypes include hybrid plants that 
exceed the parental phenotype in a negative or a positive 
direction (Rieseberg et al. 2000). Transgressive phenotypes 
arise if parental species contain alleles with opposing 
effects, where some lines derive the positively contributing 
alleles from both parents and others derive the negatively 
contributing alleles, leading to hybrid genotypes that are 
more extreme than the parental lines (Lynch and Walsh 
1998). In a review of 171 studies on segregating plant and 
animal hybrids, Rieseberg et al. (1999) showed that in 155 
studies at least one transgressive trait was reported and that 
44% of 1229 traits examined were transgressive. These
studies show that both heterosis and transgressive segrega- 
tion are widespread phenomena in hybridizing species 
(Rieseberg et al. 1999, 2003), suggesting that there is a high 
likelihood that at least some crop-wild hybrids have an 
increased fitness relative to the wild type in a given envi- 
ronment (Johansen-Morris and Latta 2006; Latta et al. 
2007). Therefore, rather than estimating average hybrid fit- 
ness, it is necessary to view the entire fitness distribution of 
the hybrid lineages and identify how many individual 
hybrid lineages outperform the wild relative and when. 
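
The mechanism behind transgressive segregation described above can be illustrated with a minimal sketch. It assumes two hypothetical loci with purely additive effects of opposite direction in the two parents; the effect sizes, locus names and trait units are invented for illustration and are not taken from the lettuce data.

```python
from itertools import product

# Hypothetical additive effects at two loci (arbitrary trait units).
# At locus1 the crop allele raises the trait; at locus2 the wild allele does.
EFFECTS = {
    "locus1": {"crop": +1.0, "wild": -1.0},
    "locus2": {"crop": -1.0, "wild": +1.0},
}


def trait_value(genotype):
    """Sum additive effects for a homozygous line (locus -> parental origin)."""
    return sum(EFFECTS[locus][origin] for locus, origin in genotype.items())


# Each parent carries one increasing and one decreasing allele.
parents = {
    "crop": {"locus1": "crop", "locus2": "crop"},
    "wild": {"locus1": "wild", "locus2": "wild"},
}
parent_values = {name: trait_value(g) for name, g in parents.items()}

# All homozygous allele combinations that can arise among inbred hybrid lines.
lines = [dict(zip(EFFECTS, combo)) for combo in product(["crop", "wild"], repeat=2)]
line_values = [trait_value(g) for g in lines]

# A line is transgressive when it falls outside the parental range.
low, high = min(parent_values.values()), max(parent_values.values())
transgressive = [g for g, v in zip(lines, line_values) if v > high or v < low]
```

In this toy case both parents have the same trait value, while the two recombinant lines that combine either both increasing or both decreasing alleles fall outside the parental range in the positive and negative direction, respectively.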

In addition to the potentially different response of 
hybrids from different parental lines, or from early- and 
late-generations, hybrid performance is also subject to 
Genotype × Environment (G × E) interactions (Barton
2001; Hails and Morley 2005). For example, several QTL 
studies that compared hybrid performance between green- 
house and field environments have shown that different 
traits and loci were favoured because of different selection 
pressures (Martin et al. 2006; Latta et al. 2007; Hartman 
et al. 2012). Similarly, hybrid fitness selection patterns dif- 
fer across different natural environments (Weinig et al. 
2003) and as a consequence of varying stresses, such as 
competition (Mercer et al. 2007). This suggests that hybrid 
fitness might be weakly correlated across divergent envi- 
ronments (Latta et al. 2007) and that as a result of these 
G × E interactions different hybrid lineages, and consequently
alleles, might be selected for in different environments
(Mercer et al. 2007). Moreover, hybridization
between two wild parental species can lead to the coloniza- 
tion of new habitats previously unavailable to either of the 
parental species (Rieseberg et al. 2007). Therefore, the 
hybrid fitness distributions of different types of crosses and 
generations should also be considered in different environ- 
ments, including the original wild habitat and novel envi- 
ronments, as we have done in this study. 

In this study, we used progeny from different crosses 
between the crop lettuce (Lactuca sativa L.) and the wild-type
prickly lettuce (Lactuca serriola L.). These species are
fully cross-compatible and interfertile without any crossing
barriers (Koopman et al. 2001). A recent study suggested
that a substantial part of wild L. serriola plants in Europe
(7%) show evidence of previous introgression of alleles 
from L. sativa (Uwimana et al. 2012a). In addition, it was 
demonstrated that compared with the wild parent up to 
four hybrid generations had higher average germination 
and survival rates in the field (Hooftman et al. 2005, 2007, 
2009). Moreover, part of the crop genome was selectively 
advantageous leading to skewed crop-wild allele distribu- 
tions (Hooftman et al. 2011). Although it is often assumed 
that crop alleles confer negative fitness effects in the wild 
habitat (Stewart et al. 2003), this suggests that in lettuce 
parts of the crop genomic background contribute to higher 
hybrid fitness and, therefore, potentially to the transfer of
crop alleles to the wild population. 

We used two hybrid generations, early BC lines and late-generation
RILs, which originated from different
parental lines. We employed these hybrid lineages and their
parents in a location with sandy soil, which is similar to the 
natural habitat in which L. serriola occurs, and one with 
clay soil, which can be considered as a novel habitat given 
the current distribution of L. serriola (Hooftman et al. 
2006). In a previous study, we identified two genomic 
regions under selection in the RILs, one where the crop 
genomic background was selectively beneficial and one 
where the wild genomic background was selectively benefi- 
cial (Hartman et al. 2012). In this study, we extend this 
analysis to the comparison with BC lines employed in the 
same experiment as the RILs and, in addition, studied the 
performance of individual hybrid lineages for both crossing 
types. This design allowed us to study similarities and dif- 
ferences in genomic selection patterns between different 
lettuce cultivar-wild crosses, hybrid performance in early- 
and late-generation hybrids and environmental influence 
on hybrid fitness distributions. We address these specific 
questions: (i) Which crop genomic regions are under posi- 
tive or negative selection and are these similar or different 
between the BC and RIL crossing populations? (ii) Do the 
crop-wild hybrid populations differ in their fitness distri- 
bution and do they include hybrid lineages that perform 
better than the wild parent? (iii) Are there environment 
specific effects on the fitness distributions? In particular, is 
there an indication that introgression is more likely to 
occur in a novel habitat compared to the original habitat of 
the wild relative? Finally, we discuss the likelihood of crop 
gene transfer to the wild relative and the implications for 
ERA procedures. 

Material and methods 

Plant material 

In this study, two different lettuce crop-wild crosses were 
employed. We used 98 lines of an existing RIL population 
(selfed for nine generations) derived from a cross between 
the cultivar L. sativa cv. Salinas (Crisphead) and Califor- 
nian L. serriola (UC96US23; Johnson et al. 2000; Argyris 
et al. 2005; Zhang et al. 2007). In addition, we used 98 
backcross lines selfed for one generation (BC1S1) from a
cross between the cultivar L. sativa cv. Dynamite (Butterhead)
and an L. serriola collected near the town of Eys, the
Netherlands (designated cont83 in Van de Wiel et al.
(2010); further referred to as L. serriola (Eys)).

Lactuca sativa was used as the pollen donor to mimic a
hybridization event due to pollen flow from the crop to a
neighbouring wild population. The F1 hybrid plant was
subsequently backcrossed to the wild type, creating a BC1
generation, and each BC1 was then selfed to create a BC1S1
population. Crossing followed the protocols of Nagata
(1992) and Ryder (1999), and is described in detail in
Hooftman et al. (2005). Note that BC1 individuals were genotyped,
whereas the BC1S1 were used in the experiments (see
below).

Both wild L. serriola parents used in the crosses have 
leaves that are long and serrated, and contain a white latex 
substance. Plants develop up to 2 mm long spines on 
downside leaf midribs as well as on the base of the main 
stem. Lactuca serriola develops a rosette and flowers in July-August
with many reproductive side shoots in the inflorescence
and at the base of the plant. Capitula (flower
heads) produce approximately 15-20 florets that develop 
into brown single-seeded achenes (further referred to as 
seeds). When seeds are ripe the involucral bracts become 
reflexed. Lactuca serriola occurs predominantly in ruderal 
sites, for example, along roads, railways and construction 
sites. This species is an annual that survives the winter 
mainly as seed, but also occasionally as small rosettes 
(Y. Hartman, field observation). Lettuce mainly reproduces 
by selfing, but research has shown that up to 5% outcross- 
ing rates can be reached via insect pollination (D'Andrea 
et al. 2008; Giannino et al. 2008). 

In contrast, the crop-types of L. sativa used in this study 
do not have spines and leaves are broad instead of serrated 
and do not contain latex. Plants develop a compact head 
instead of a rosette and do not have reproductive side 
shoots at the base of the stem. The cultivar group of Crisp- 
head typically develops a very dense head (de Vries 1997) 
and develops brown seeds, whereas the Butterheads 
develop a relatively loose head and white seeds. Both culti- 
vars have erect involucral bracts when seeds are ripe, most 
likely selected for to prevent seed shattering (de Vries 
1997). 

Experimental set-up and analysis 

This study was conducted in two contrasting field sites. 
The soil at the first site, located in Sijbekarspel (SB), the 
Netherlands (N52°42', E04°58'), consisted of nutrient rich 
and water retaining clay similar to agricultural conditions. 
The second site, located in Wageningen (WG), the Nether- 
lands (N51°59', E05°39'), was similar to the wild habitat 
with dry, nutrient-poor and sandy soil. The weather condi- 
tions during the experiment were not different between the 
two sites (see Table S1).

For a detailed description of the experimental set-up see 
Hartman et al. (2012). In short, both sites consisted of 12 
blocks, each with all 98 RILs, 98 BC1 families and the
parental lines. Blocks contained 200 squares (40 × 40 cm)
to which lines were randomly assigned, leading to a total of 
4800 squares. We started the experiment with 30 seeds 
sown in each square and followed plants during the entire
life cycle. Squares were thinned leaving one individual to 
reach the adult stage. This means that the data consisted of 
fitness estimates for all 4800 plants (i.e. including survival) 
and on average measurements on 4221 plants for different 
phenotypic traits. 

Statistical and QTL analyses were performed on data of
traits measured in the field. On the basis of the fitness QTL
found, we could distinguish 'fitness QTL genotypes' in
both RILs and BCs, and compared their fitness distributions
and the influence of the proportion of crop genome.

Traits measured 

During the experiment, from May until October, we mea- 
sured the following traits related to fitness (Table 1). Ger- 
mination was measured 4 weeks after sowing and biomass 
measurements were done 7 weeks after sowing. Sites were 
visited daily to record the flowering date. At the seed set 
stage, the branches of the main inflorescence and basal 
reproductive side shoots were counted. In addition, we 
counted seeds from ten collected capitula and estimated 
the average number of seeds per capitulum. The number of 
shoots and branches was used to estimate the total number 
of capitula (see Hooftman et al. 2005 and Data S1). Subsequently,
seed output was estimated by multiplying the average
number of seeds per capitulum by the total number
of capitula. We scored survival as a binary trait with 1 for 
survival until seed production and 0 for individuals that 
either died before seed set or did not complete their life
cycle before the end of the growing season. We divided the
number of seed-producing plants per line by twelve to cal- 
culate the survival rate. The final trait, seeds produced per 
seed sown (SPSS) was calculated using the following for- 
mula: 

\[
\mathrm{SPSS} = \text{Estimated seed output per reproductive plant} \times \text{Survival} \times \text{Germination rate} \tag{1}
\]

Of all traits, SPSS is the closest estimate of life cycle fit- 
ness and therefore referred to as the 'main fitness trait'. The 
calculation of SPSS differs slightly from that in Hartman
et al. (2012), where we used the average survival rate per line
to calculate SPSS for each square, whereas here we used
individual survival (i.e. either 0 or 1).
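
As an illustration of the SPSS calculation per square, the following minimal Python sketch multiplies estimated seed output, binary survival and germination rate. The function and argument names are hypothetical, and setting SPSS to zero for plants that did not reach seed set is an assumption that follows from the binary survival score.

```python
def germination_rate(seedlings, seeds_sown=30):
    """Proportion of sown seeds that produced a seedling in one square."""
    return seedlings / seeds_sown


def survival_rate(survival_flags, replicates=12):
    """Per-line survival rate: seed-producing plants divided by the replicates."""
    return sum(survival_flags) / replicates


def spss(seed_output, survived, seedlings, seeds_sown=30):
    """Seeds produced per seed sown for one square.

    seed_output: estimated seed output of the plant (ignored if it did not survive)
    survived:    True if the plant survived until seed production
    seedlings:   number of seedlings counted in the square
    """
    if not survived:
        # Survival is scored 0, so the product is 0 regardless of seed output.
        return 0.0
    return seed_output * 1 * germination_rate(seedlings, seeds_sown)


example_reproducing = spss(seed_output=2000, survived=True, seedlings=15)
example_dead = spss(seed_output=None, survived=False, seedlings=20)
example_line_survival = survival_rate([1] * 9 + [0] * 3)
```

With these made-up numbers, a surviving plant with an estimated 2000 seeds in a square where 15 of 30 seeds germinated has an SPSS of 1000, and a line with nine seed-producing plants out of twelve has a survival rate of 0.75.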

Statistical analysis 

We used PASW Statistics 17.0 (SPSS Inc 2009) for the statistical
analyses. To improve normality, all traits
were transformed, except for number of seeds per capitulum,
because this trait was already normally distributed.
Proportional data, such as survival and germination rates, 
were arcsine-square-root-transformed. Other traits were 
log-transformed (total number of capitula, number of 
branches, number of reproductive basal shoots and bio- 
mass) or square-root-transformed (SPSS and seed output). 
For each trait, the mean, standard deviation and heritability 
values were estimated. In addition, we also calculated the
selection differentials for each trait by taking the covariance
between the relative fitness and trait values (both with 12
data points per RIL or BC line). The relative fitness was
calculated by dividing SPSS of each plant by the overall
mean SPSS for a site.

Table 1. Traits studied in a recombinant inbred lines (RILs) population of a Lactuca sativa cv. Salinas × Lactuca serriola (UC96US23) cross and in
backcross (BC1S1) families of a L. sativa cv. Dynamite × L. serriola (Eys) cross.

| Plant stage | Trait | Abbreviation | Measurement and estimation method |
|---|---|---|---|
| Seedling | Germination rate | GM | Total number of seedlings divided by 30 seeds sown; seedlings counted 4 weeks after sowing; arcsine-square root transformation |
| Rosette | Biomass (g) | BM | Average dry weight of two rosettes; log transformation |
| Flowering | Days to first flower (day) | FLD | Number of days between sowing and the appearance of the first flower; log transformation |
| Seed set | Number of basal reproductive side shoots (count) | SHN | Number of basal reproductive side shoots which produced flowers, flower buds or seed heads; log transformation |
| | Number of branches main inflorescence (count) | BRN | Number of branches of the main inflorescence; log transformation |
| | Number of seeds per capitulum | SDC | Average number of seeds from ten collected capitula; no transformation necessary |
| | Total number of capitula | TC | Total number of capitula, estimation following Hooftman et al. (2005, eqn 1); log transformation |
| | Seed output | SDO | Total seed production, estimation following Hooftman et al. (2005); square root transformation |
| | Survival rate | SUR | Number of RIL or BC plants with seed production divided by twelve; arcsine-square root transformation |
| | Seeds produced per seed sown | SPSS | Number of seeds produced per seed sown, estimated by multiplying seed output, survival and germination rate; square root transformation |
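
The transformations and the selection differential described in the statistical analysis can be sketched in Python as follows. This is an illustration only, since the analyses themselves were run in PASW Statistics. The function names are hypothetical, the logarithm base is assumed to be 10 because the text does not state it, and the covariance uses the sample (n - 1) denominator as a further assumption.

```python
import math


def arcsine_sqrt(p):
    """Arcsine-square-root transform for a proportion between 0 and 1."""
    return math.asin(math.sqrt(p))


def sqrt_transform(x):
    """Square-root transform, as listed for SPSS and seed output."""
    return math.sqrt(x)


def log_transform(x):
    """Log transform for positive values; base 10 is an assumption."""
    return math.log10(x)


def selection_differential(trait_values, spss_values, site_mean_spss):
    """Covariance between relative fitness and trait values for one line.

    Relative fitness is the SPSS of each plant divided by the overall
    mean SPSS of the site.
    """
    relative_fitness = [s / site_mean_spss for s in spss_values]
    n = len(relative_fitness)
    mean_w = sum(relative_fitness) / n
    mean_z = sum(trait_values) / n
    cross_products = sum(
        (w - mean_w) * (z - mean_z)
        for w, z in zip(relative_fitness, trait_values)
    )
    return cross_products / (n - 1)  # sample covariance (assumed)


# Toy data: three plants, trait rising with fitness.
example_s = selection_differential([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], 20.0)
```

A positive value indicates that higher trait values are associated with higher relative fitness, and a negative value indicates the opposite, as reported for days to first flower.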

We used heritability values to assess how much of the 
variation was due to genetic differences. Broad-sense heritability
values ($H^2$) were estimated as the proportion of the
total variance accounted for by the genetic variance using
the formula:

\[
H^2 = \frac{V_g}{V_g + V_e} \times 100
\]

where $V_g$ is the genetic variance and $V_e$ is the environmental
variance. $V_g$ and $V_e$ were inferred from between-
and within-line variance components extracted with procedure
VARCOMP (SPSS Inc 2009). Heritability values of
family means ($H_f^2$) were estimated using the following formula
(Chahal and Gosal 2002):

\[
H_f^2 = \frac{V_g}{V_g + (V_e / n)} \times 100
\]

where $n$ is the average number of individuals per line
measured for a certain trait (Table 2). The latter value indi- 
cates how well the family mean estimate resembles the true 
genetic value, given the number of replicates used, and is 
therefore important for the power of the QTL analyses. 
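
A minimal sketch of the two heritability estimates follows. It assumes the standard forms: broad-sense heritability as genetic variance over the sum of genetic and environmental variance, and family-mean heritability with the environmental variance divided by the number of replicates per line. The function names and the example variance components are hypothetical.

```python
def broad_sense_heritability(v_g, v_e):
    """Percentage of total variance explained by genetic variance."""
    return v_g / (v_g + v_e) * 100


def family_mean_heritability(v_g, v_e, n):
    """Heritability of line means when each line is replicated n times.

    Averaging over n plants shrinks the environmental variance of the
    line mean to v_e / n, so this value rises with replication.
    """
    return v_g / (v_g + v_e / n) * 100


# Made-up variance components: between-line 1.0, within-line 9.0.
h2 = broad_sense_heritability(1.0, 9.0)
hf2 = family_mean_heritability(1.0, 9.0, n=12)
```

With these made-up components, a trait with a broad-sense heritability of 10% still reaches a family-mean heritability of about 57% with twelve replicates, which shows why replication matters for the power of the QTL analyses.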



Quantitative trait loci analysis

For RILs, the genetic map employed consisted of 1513 predominantly
AFLP and EST-derived SNP markers
(http://cgpdb.ucdavis.edu/GeneticMapViewer/display/; map version:
RIL_MAR_2007_ratio; Johnson et al. 2000; Argyris
et al. 2005; Zhang et al. 2007); both map and markers were
developed by the Compositae Genome Project
(http://compgenomics.ucdavis.edu).

For BC lines, the genetic map consisted of 347 SNP
markers distributed over nine linkage groups (described in
detail in Uwimana et al. 2012b). These were selected from
1083 SNPs, developed by the Compositae Genome Project
(http://compgenomics.ucdavis.edu/compositae_SNP.php)
from disease resistance and developmental genes in lettuce,
using a customized Illumina GoldenGate array with markers
polymorphic between the parent lines. Note that BC1
plants were genotyped and that their offspring (BC1S1) were
used in the experiments. We conducted the QTL analyses
in QTL Cartographer (version 2.5.008, Wang et al. 2010).
RIL and BC1S1 data were analysed separately. We used
Composite Interval Mapping (CIM), testing at 2 cM intervals,
and a stepwise regression method (forward and backward)
with five background cofactors and a 10 cM window.
Permutation tests were used to estimate a significance
threshold of α = 0.05 for QTL using 1000 iterations (Doerge
and Churchill 1996). Additive effects and one-LOD support
intervals were obtained from the CIM results.
MapChart 2.2 was used to draw the linkage map and QTL
results (Voorrips 2002). The marker order of LG1, 3, 4, 7
and 8 of the BC map was reversed to be able to compare
RIL and BC QTL; 80 markers were shared between the RIL
and BC maps (Fig. 1).

Fitness distributions

To visualize variation in fitness for both sites, we ranked all
98 BC or RIL and parental lines based on the estimated
average SPSS and plotted the estimated average SPSS of
lines against their rank. In addition, we visualized the influence
of major fitness QTL on the fitness distributions. We
focussed specifically on the genomic regions where BC and
RIL fitness QTL co-localized across sites. Lines that we
could unequivocally assign to a certain 'fitness QTL genotype'
were colour-coded. Colour-coded lines had no missing
data and all flanking markers were of one parental background.
Colour codes indicated whether fitness QTL contained
alleles from the crop or the wild parent or a combination
of both parental lines. We also estimated the average rank
per fitness QTL genotype, indicating whether a certain fitness QTL
genotype had on average a high or low rank.



Influence of the proportion crop genome 

To visualize the influence of the amount of crop genome 
on fitness, we plotted the estimated average SPSS of BC1
families and RILs against an estimate of the percentage of 
crop genome. This estimate was based on counting markers 
as coming from the crop or wild relative (missing data were 
excluded). The analysis was done for both sites and crossing
types separately and included all 98 RIL or BC1 families
and all parental lines. 
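
The marker-counting estimate of the percentage of crop genome can be sketched as below. The function name and the 'crop' / 'wild' / None coding of marker calls are hypothetical. How heterozygous markers in the backcross families were counted is not specified in the text, so the sketch handles only calls assigned to one parent and skips everything else as missing.

```python
def percent_crop_genome(marker_calls):
    """Percentage of scored markers that come from the crop parent.

    marker_calls: iterable of 'crop', 'wild', or None for missing data.
    Missing calls are excluded from both numerator and denominator.
    Returns None if no marker could be scored.
    """
    scored = [call for call in marker_calls if call in ("crop", "wild")]
    if not scored:
        return None
    crop_count = sum(1 for call in scored if call == "crop")
    return 100 * crop_count / len(scored)


# Toy line with five markers, one of them missing.
example_pct = percent_crop_genome(["crop", "wild", "wild", None, "crop"])
```

Because missing calls are dropped before counting, a line with many missing markers is estimated from fewer data points, which is worth keeping in mind when comparing lines.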

First, we used a univariate linear regression to esti- 
mate the overall relationship between SPSS and the 
percentage of crop genome in R (version 2.14.0; R 
Development Core Team 2011). Second, we repeated 
this analysis, while excluding the effect of the two 
major fitness QTL by adding these as covariates (based 
on the genotype data that were also used for the fitness 
distributions), thereby estimating the relationship
between the residual variation in SPSS and the percent- 
age of crop genome. In this second analysis, we omit- 
ted genotypes for which the presence of the fitness 
QTL was ambiguous, either due to missing markers or 
a recombination event in the QTL interval. In addition, 
we estimated the average amount of crop genome per 
fitness QTL genotype. 
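
The two regression steps can be illustrated with a small, dependency-free Python sketch. The original analysis was run in R, so this is not the authors' code. The 0/1 coding of the two fitness QTL genotypes, the function names and the toy data are all assumptions for illustration.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations.

    rows: list of predictor rows (the caller includes the intercept column).
    Solved by Gauss-Jordan elimination with partial pivoting; a singular
    design matrix will raise ZeroDivisionError.
    """
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
                c[r] -= factor * c[col]
    return [c[i] / a[i][i] for i in range(k)]


def crop_genome_slopes(spss, pct_crop, qtl_a, qtl_b):
    """Slope of SPSS on % crop genome, without and with QTL covariates."""
    simple = ols([[1.0, p] for p in pct_crop], spss)
    adjusted = ols(
        [[1.0, p, qa, qb] for p, qa, qb in zip(pct_crop, qtl_a, qtl_b)], spss
    )
    return simple[1], adjusted[1]


# Toy data generated from a known model: slope -0.5 per % crop genome.
pct = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
qtl_a = [0, 1, 0, 1, 1, 0]  # hypothetical coding: 1 = crop allele at first QTL
qtl_b = [1, 1, 0, 0, 1, 0]  # hypothetical coding: 1 = crop allele at second QTL
spss = [100 - 0.5 * p - 20 * a + 10 * b for p, a, b in zip(pct, qtl_a, qtl_b)]
overall_slope, adjusted_slope = crop_genome_slopes(spss, pct, qtl_a, qtl_b)
```

In the toy data the adjusted slope recovers the value used to generate the data, whereas the unadjusted slope also absorbs the effects of the two QTL. This is the reason for repeating the regression with the QTL as covariates.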




Figure 1 Positions of quantitative trait loci (QTL) in backcross (BC1S1) families of a Lactuca sativa cv. Dynamite × L. serriola (Eys) cross and a recombinant
inbred lines (RIL) population of a L. sativa cv. Salinas × L. serriola (UC96US23) cross using composite interval mapping. Map distances (cM) are
located on the left side. The same linkage groups of RIL and BC map are shown next to each other; markers are shown as horizontal lines. Linkage
group names are shown at the top and dotted lines between linkage group bars indicate similar markers. RIL QTL are shown on the left side of linkage
groups by black or grey bars, whereas BC QTL are shown on the right. Black bars indicate Wageningen QTL and grey bars indicate Sijbekarspel
QTL. When the crop genomic background (L. sativa) gives a selective advantage (derived from the selection differentials shown in Table 2) the QTL is
shown as an open bar; when the wild genomic background (L. serriola) gives a selective advantage the QTL is shown as a filled bar. The length of
QTL bars is determined by the one-LOD confidence interval. Abbreviations are listed in Table 1.
Results

General survival 

Survival of plants was comparable between sites. For RILs, 
57.1% of plants survived until reproduction at WG and 
56.9% survived at SB (Hartman et al. 2012). A higher per- 
centage of BC individuals survived until reproduction at 
both sites; 80.1% for WG and 72.4% for SB. 

Parental lines 

The main difference between the cultivars and wild paren- 
tal lines is that most crop individuals died before seed pro- 
duction, whereas the majority of wild-type individuals 
survived and produced seeds (Table 2). In both SB and 
WG, only one L. sativa cv. Salinas individual survived until 
flower production, but died before reproductive characters 
could be recorded. Similarly, only one L. sativa cv. Dyna- 
mite individual survived until flower production in SB; in 
WG, four individuals survived until flowering but only one 
of them produced seeds in four capitula. Other trends are 
that crop cultivars had higher germination rates, higher 
biomass production and flowered later compared with the 
wild parental lines of the same cross (Table 2). In addition, 
all parental lines developed faster and flowered earlier in 
WG compared to SB. 

Heritability values and selection differentials

Heritability patterns were more variable among BC
lines than among the RILs, consistent with the larger 
genetic variation within and among these lines. For BC 
lines, biomass, number of reproductive basal shoots and 
seed output had the lowest heritability values in SB, 
whereas in WG, number of reproductive basal shoots and 
branch number had the lowest heritability values. At both 
sites, germination showed the highest broad-sense and 
family-mean heritability. For RILs, branch number, bio- 
mass and germination rate showed the lowest broad-sense 
and family-mean heritability values, whereas days until first 
flower showed the highest values at both sites. 

For BC lines, broad-sense heritability values varied from 
6.2% to 30.2% and family-mean heritability values varied 
from 41.8% to 83.9%. For RILs, these varied between 
14.1% and 89.5% and 62.7% and 98.9%, for broad-sense 
and family-mean heritabilities respectively (Table 2), indi- 
cating that the replication level was adequate, given the 
environmental variation under field conditions. 

The majority of selection differentials showed significant 
trends (Table 2), except for BC1S1 biomass in SB and WG.
Across sites and crosses, all selection differentials indicated 
that higher values were favoured, with the exception of 
days to first flower. For this trait lower values were
favoured, namely 6-7 days earlier flowering for RILs and
5-9 days for BC1 families.

Quantitative trait loci analysis 

For the BC1 families, we detected a total of 43 QTL
for ten fitness and fitness-related traits distributed over
all nine linkage groups (Table 3; Fig. 1). The Phenotypic
Variation Explained (PVE) ranged from 6.4% to
42.8%. One to three QTL were detected per trait
(mean 2.2) and 1-LOD support intervals varied between
4.2 and 34.7 cM (mean 13.7 cM). When the two field
sites are combined for all ten traits, nine QTL were 
detected at both sites; the remaining 25 QTL were 
unique for one of the sites. QTL results of the RIL 
population are summarized in Fig. 1 and are described 
in more detail in Hartman et al. (2012, see Table S2). 
In short, a total of 49 QTL was detected and when the 
two field sites are combined, eleven QTL were found at 
both sites, whereas 27 QTL were unique for one of the 
sites. 

The comparison between RIL and BC QTL fitness clus- 
ters shows similarities but also differences (Fig. 1). For 
both crosses, there were two genomic regions where several 
QTL clustered including QTL for SPSS, the main fitness 
QTL. For the BC1, these regions were located at LG6 (bottom)
and at LG7 (top; Fig. 1). The same QTL are found for
SB and WG at these genomic locations and in both cases 
selection differentials indicated that the selective advantage 
was conferred by the wild allele for these QTL. At LG6 and 
LG7, the wild genomic background increased SPSS and 
survival rate and reduced days until first flower. At LG7, 
additional QTL were detected for biomass and again a 
selective advantage was conferred by the wild genomic 
background, increasing biomass. 

For the RILs, a fitness cluster was found across sites at 
the bottom of LG5, whereas a second fitness cluster was sit- 
uated at LG7 (Hartman et al. 2012), overlapping the cluster 
found for the BC population. At LG5, QTL for seeds per 
seed sown, seed output and seeds per capitulum were 
detected and a selective advantage was conferred by the 
crop allele (Fig. 1). This region corresponded with BC QTL 
found for seed output, shoot number and total capitula, 
but in contrast to the RIL QTL, no seeds per seed sown 
QTL was found and here the selective advantage was con- 
ferred by the wild rather than the crop allele. At LG7 and 
similar to BC results, a selective advantage was conferred 
by the wild allele at the QTL for SPSS, survival rate until seed set,
and days to first flower, indicating that both crop varieties 
contained gene(s) for delayed reproduction. Additional 
RIL QTL found were total capitula, shoot number and bio- 
mass, and for these traits a selective advantage was con- 
ferred by the crop allele. 
Table 3. Positions of quantitative trait loci (QTL) in backcross (BC1S1) families of a Lactuca sativa cv. Dynamite × Lactuca serriola (Eys) cross using
composite interval mapping. Quantitative trait loci results of the recombinant inbred lines population from a L. sativa cv. Salinas × L. serriola
(UC96US23) cross are described in detail in Hartman et al. (2012; but see Table S2 for SPSS QTL). A positive additive effect indicates that the crop genomic
background (L. sativa) causes higher trait values, whereas a negative additive effect indicates that the wild genomic background (L. serriola)
causes higher values. QTL on the same line have peak values within 5 cM.



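Columns are grouped by field site: Sijbekarspel (SB) and Wageningen (WG). Empty cells indicate that no QTL was detected at that site.

| LG | Trait | SB position (cM) | SB one-LOD interval | SB additive effect | SB PVE (%) | SB LOD | WG position (cM) | WG one-LOD interval | WG additive effect | WG PVE (%) | WG LOD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | TC | 70.7 | 60.4-75.5 | -0.04 | 11.8 | 3.2 | | | | | |
| 2 | SUR | 30.5 | 18.7-42.1 | 0.15 | 6.4 | 3.0 | | | | | |
| 2 | SPSS | 32.5 | 24.1-41.8 | 20.92 | 9.4 | 3.0 | | | | | |
| 3 | TC | | | | | | 19.4 | 6.4-24.6 | 0.04 | 13.3 | 3.9 |
| 3 | TC | | | | | | 35.2 | 31.4-55.0 | 0.03 | 10.9 | 3.2 |
| 3 | SDC | 51.2 | 35.1-69.8 | -1.93 | 20.2 | 4.7 | | | | | |
| 3 | SDO | 84.0 | 72.0-100.2 | -19.63 | 19.1 | 4.5 | | | | | |
| 3 | SHN | | | | | | 144.6 | 141.6-145.8 | 0.07 | 12.8 | 4.2 |
| 3 | SHN | 155.9 | 150.4-158.1 | 0.21 | 19.5 | 4.9 | | | | | |
| 4 | SDC | | | | | | 46.2 | 34.6-58.7 | -1.19 | 12.7 | 4.0 |
| 4 | BRN | 63.3 | 61.3-71.6 | 0.04 | 10.4 | 3.5 | 68.6 | 63.3-80.0 | 0.03 | 8.5 | 2.9 |
| 4 | BM | 141.3 | 140.6-147.7 | -0.02 | 10.1 | 3.6 | | | | | |
| 5 | BRN | | | | | | 27.8 | 12.8-41.2 | -0.03 | 10.6 | 3.1 |
| 5 | TC | | | | | | 175.2 | 161.8-185.9 | -0.04 | 16.1 | 3.6 |
| 5 | SDO | | | | | | 177.2 | 169.2-184.8 | -14.97 | 18.7 | 4.7 |
| 5 | SHN | | | | | | 177.2 | 170.0-183.8 | -0.11 | 31.2 | 7.6 |
| 6 | SUR | 91.6 | 84.9-92.7 | -0.36 | 37.7 | 12.7 | 89.6 | 84.0-92.7 | -0.39 | 42.8 | 14.2 |
| 6 | SPSS | 92.7 | 84.1-94.7 | -26.80 | 16.3 | 5.7 | 92.7 | 82.3-94.7 | -30.54 | 18.4 | 5.9 |
| 6 | FLD | 92.7 | 83.8-94.7 | 0.03 | 13.0 | 4.5 | 89.6 | 82.5-92.7 | 0.03 | 27.3 | 8.0 |
| 7 | BM | 6.3 | 3.1-9.1 | 0.04 | 24.3 | 7.5 | 3.1 | 1.1-6.8 | 0.06 | 24.5 | 7.5 |
| 7 | FLD | 6.3 | 2.2-11.0 | 0.03 | 19.0 | 6.0 | 3.1 | 1.1-8.3 | 0.02 | 8.8 | 3.1 |
| 7 | SPSS | 10.4 | 8.3-12.9 | -30.58 | 20.6 | 6.9 | 10.4 | 3.1-14.3 | -20.51 | 8.5 | 3.0 |
| 7 | SUR | 10.4 | 6.0-12.9 | -0.30 | 26.0 | 9.7 | 10.4 | 6.0-12.9 | -0.31 | 26.5 | 10.2 |
| 8 | GM | 4.6 | 4.3-11.7 | -0.09 | 17.8 | 4.5 | 3.1 | 2.0-8.8 | -0.09 | 10.0 | 2.7 |
| 8 | GM | 19.2 | 16.9-21.8 | -0.10 | 22.4 | 6.1 | | | | | |
| 8 | FLD | 19.2 | 17.2-25.7 | -0.03 | 14.8 | 5.0 | | | | | |
| 8 | BM | | | | | | 24.5 | 17.2-30.0 | -0.04 | 11.8 | 4.8 |
| 8 | FLD | | | | | | 41.5 | 35.3-45.3 | -0.02 | 11.9 | 4.0 |
| 8 | BRN | | | | | | 62.7 | 55.7-70.6 | -0.03 | 13.9 | 4.4 |
| 9 | BRN | 0.00 | 0.0-4.2 | -0.04 | 10.9 | 3.2 | | | | | |
| 9 | SDC | | | | | | 15.9 | 9.3-28.8 | -1.44 | 17.9 | 5.8 |
| 9 | BRN | 19.9 | 9.4-34.1 | -0.06 | 19.7 | 5.3 | | | | | |
| 9 | SDO | | | | | | 25.9 | 16.9-34.8 | -17.79 | 26.5 | 6.5 |
| 9 | SHN | | | | | | 33.9 | 20.1-48.2 | -0.06 | 11.8 | 3.2 |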

PVE, percentage of variation explained; SPSS, seeds produced per seed sown. Other abbreviations are listed in Table 1.



Fitness distributions 

Fitness distributions of RIL and BC crossing populations 
differed considerably. All BC lines had some seed output,
whereas approximately 30% of RILs produced no seeds in 
SB and WG (Fig. 2). They either died before seed set or did 
not complete their life cycle before the end of the growing 
season. For RILs, the proportion of lines that performed 
better than the wild parent was comparable across sites, 
with 27% in SB and 23% in WG. For BC lines there was a 
considerable difference, with 79% of lines performing bet- 
ter than the wild parent in SB, whereas only 5% performed 
better in WG. 



Given the QTL fitness regions, BC lines with a wild geno- 
mic background for LG6 and 7 (6W-7W) were expected to 
have the highest seed yield, whereas the opposite combination
(6H-7H; H indicating that BC1 genotypes were heterozygous
for these loci) should have the lowest seed yields.
The 6W-7W lines (green bars) are indeed situated at the 
high-end of the fitness distributions, whereas the 6H-7H 
lines (red bars) are situated at the low-end side (Fig. 2). 
This is reflected in the average ranks of 24.0 out of 100 in 
SB and 30.5 in WG for 6W-7W lines, and 78.6 in SB and
77.9 in WG for 6H-7H lines (Table 4). 

Figure 2 Fitness distributions across lines for (A) backcross (BC1S1) families in Sijbekarspel (SB), (B) recombinant inbred lines (RILs) in SB, (C) BC1S1
families in Wageningen (WG) and (D) RILs in WG. Each bar represents one line. Lines are ranked based on the average Seeds Produced per Seed
Sown. Coloured squares below the x-axis indicate the genotype for genomic fitness regions on LG6 and 7 for BC lines, and LG5 and 7 for RILs; for
genotype notation, see Table 4. Black squares indicate parent lines and grey squares indicate lines for which the genotype remains unknown.

Recombinant Inbred Lines with the crop genomic background
for LG5 and the wild parental background for LG7
(5C-7W) were expected to have the highest fitness. Lines
with this fitness QTL genotype (blue bars) are indeed
mostly located at the high-end of the fitness distribution
(Fig. 2) and had the highest average rank at both sites (27.6
of 100 in SB and 28.9 in WG, Table 4). RILs with the opposite
combination, 5W-7C (orange bars), were mainly situated at
the low-end of the fitness distribution and had the lowest
average rank of 76.5 in SB and 73.1 in WG.

These QTL fitness regions do not explain all variation of 
the fitness distributions as seen by the mixed distribution 
of the coloured bars (Fig. 2). The PVE of the QTL for seed 
production (SPSS) reflects the unexplained variation. The 
combined PVE for BC fitness QTL was approximately 27% 
(WG) to 37% (SB), and for RIL fitness QTL approximately 
30% at both sites, implying that part of the variation went 
undetected. 
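
The combined PVE values quoted above can be reproduced from the per-QTL values for SPSS in Table 3. The short Python sketch below does this for the two major BC fitness regions (LG6 and LG7). The dictionary layout and variable names are illustrative only, and treating the combined PVE as the plain sum of the two QTL is an assumption.

```python
# PVE (%) of the SPSS QTL on LG6 and LG7 in the BC1S1 families (Table 3).
spss_pve = {
    "SB": {"LG6": 16.3, "LG7": 20.6},
    "WG": {"LG6": 18.4, "LG7": 8.5},
}

# Assumption: the combined PVE is the simple sum of the two QTL.
combined_pve = {site: round(sum(qtl.values()), 1) for site, qtl in spss_pve.items()}

# Share of the variation in SPSS not accounted for by these two QTL.
unexplained = {site: round(100 - total, 1) for site, total in combined_pve.items()}

for site in combined_pve:
    print(f"{site}: combined PVE = {combined_pve[site]}%, "
          f"not accounted for = {unexplained[site]}%")
```

Under this assumption, well over half of the variation in SPSS at either site is not attributable to the two mapped regions.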

Influence of the proportion of crop genome

The average amount of crop genome was 23.7% for the
BC1 lines, ranging from 10.5% to 39.5% (Fig. 3). For RILs,
the average was 50.9%, ranging from 29.1% to 76.9%.
There was a large spread in SPSS for both BC1S1 families
and RILs that had approximately the same amount of crop
genome (Fig. 3A,B). Consequently, for BC1S1 families only
3% (SB) to 7% (WG) of the variation in SPSS was explained
by the univariate linear regressions, although P-values
were significant (SB: R² = 0.03, P < 0.05, df = 96;
WG: R² = 0.07, P < 0.01, df = 96). The estimated slopes of
the linear regression were quite steep, with an increase in
crop genome from 20% to 30% predicted to result in a
reduction of 2271 seeds and 4699 seeds for SB and WG,
respectively (based on regression equations). For RILs, the
explained variance was very low with 1.0% in SB and 0.4%
in WG, and P-values were not significant (SB: R² = 0.01,
P = 0.62, df = 96; WG: R² = 0.004, P = 0.45, df = 96).
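
The predicted seed reductions follow from back-transforming the regressions, which were fitted on square-root-transformed SPSS. The sketch below is a minimal check in Python. It assumes the BC1 regression equations in the Fig. 3 caption read y = 129.4 - 1.12x (SB) and y = 176.0 - 1.79x (WG), with x the percentage of crop genome. The function and variable names are illustrative.

```python
def predicted_spss(intercept, slope, crop_genome_pct):
    """Back-transform a prediction made on the square-root scale to seeds."""
    return (intercept + slope * crop_genome_pct) ** 2

# BC1 regression coefficients (intercept, slope); assumed reading of Fig. 3.
bc1_regressions = {"SB": (129.4, -1.12), "WG": (176.0, -1.79)}

reductions = {}
for site, (intercept, slope) in bc1_regressions.items():
    at_20 = predicted_spss(intercept, slope, 20)
    at_30 = predicted_spss(intercept, slope, 30)
    reductions[site] = round(at_20 - at_30)
    print(f"{site}: {at_20:.0f} -> {at_30:.0f} seeds per seed sown, "
          f"reduction {reductions[site]}")
```

With these coefficients the sketch returns reductions of 2271 (SB) and 4699 (WG), matching the values in the text. Because of the square-root scale, the same 10-point increase in crop genome corresponds to a smaller absolute reduction at higher crop genome percentages.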

The results of the regression analysis changed considerably
for BC1 families when the variation in SPSS due to the
two major fitness QTL was removed (Fig. 3C,D). The variation
in SPSS explained by the linear regressions was lower
and P-values were no longer significant (SB: R² = 0.02,
P = 0.14, df = 74; WG: R² = 0.01, P = 0.96, df = 74). For
RILs, the explained variance was even lower and non-significant.

For RILs, the average amount of crop genome was simi- 
lar across fitness QTL genotypes (Table 4: 49.8-52.1%). 

Table 4. Average rank and amount of crop genome of four genotypes
(based on QTL of the main fitness trait seeds per seed sown) across 98
recombinant inbred lines (RILs) or backcross (BC1S1) families.

| Genotype | Average rank, Sijbekarspel | Average rank, Wageningen | % crop genome | No. of lines |
|---|---|---|---|---|
| BC1S1 families | | | | |
| 6H-7H | 78.6 | 77.9 | 31.0 | 16 |
| 6W-7W | 24.0 | 30.5 | 21.0 | 13 |
| 6H-7W | 51.9 | 52.7 | 25.1 | 27 |
| 6W-7H | 56.9 | 46.9 | 25.8 | 20 |
| No genotype | 34.6 | 42.7 | 25.4 | 22 |
| RILs | | | | |
| 5C-7C | 52.9 | 51.7 | 52.1 | 21 |
| 5W-7W | 51.3 | 53.1 | 51.0 | 23 |
| 5C-7W | 27.6 | 28.9 | 50.2 | 16 |
| 5W-7C | 76.5 | 73.1 | 52.0 | 13 |
| No genotype | 47.8 | 48.2 | 49.8 | 25 |

C, homozygous crop allele; W, homozygous wild allele; H, heterozygous
crop and wild allele; QTL, quantitative trait loci.
For RILs, letters indicate genomic fitness regions on LG5 and 7 and for
BC lines, letters indicate genomic fitness regions on LG6 and 7. For
example, 5C-7C indicates crop genotype for the identified QTL on both
LG5 and LG7; lines without sufficient information are joined into 'No
genotype'. No. of lines = number of BC or RIL lines in each category
(each line with 12 replicates per site). % crop genome = average % of
markers derived from the crop parent (BC1 or RIL).

The most advantageous BC1 fitness QTL genotype (6W-7W)
had the lowest amount of crop genome (21.0%),
whereas the least advantageous BC1 fitness QTL genotype
(6H-7H) had the highest (31.0%), indicating that selection
in this BC1 population might lead to a considerable purging
of crop genes at these genomic locations.
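
The contrast between the two populations can be made explicit with the values in Table 4. The Python sketch below compares the crop genome percentage of the least and most advantageous fitness QTL genotype in each population and checks that the genotype classes add up to the 98 lines per population. The data structure and names are illustrative, and the "best" and "worst" labels follow the average ranks in Table 4.

```python
# genotype -> (% crop genome, number of lines), from Table 4.
table4 = {
    "BC1S1": {"6H-7H": (31.0, 16), "6W-7W": (21.0, 13), "6H-7W": (25.1, 27),
              "6W-7H": (25.8, 20), "No genotype": (25.4, 22)},
    "RIL": {"5C-7C": (52.1, 21), "5W-7W": (51.0, 23), "5C-7W": (50.2, 16),
            "5W-7C": (52.0, 13), "No genotype": (49.8, 25)},
}

# Most and least advantageous fitness QTL genotype per population.
best_worst = {"BC1S1": ("6W-7W", "6H-7H"), "RIL": ("5C-7W", "5W-7C")}

crop_genome_gap = {}
line_totals = {}
for population, rows in table4.items():
    best, worst = best_worst[population]
    crop_genome_gap[population] = round(rows[worst][0] - rows[best][0], 1)
    line_totals[population] = sum(n for _, n in rows.values())
    print(population, "gap (worst - best):", crop_genome_gap[population],
          "percentage points; lines:", line_totals[population])
```

The gap is 10.0 percentage points for the BC1S1 families but only 1.8 for the RILs, which is the pattern expected if crop genome content is still tied to the fitness regions in the early generation but not in the later one.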

Discussion 

Overlapping and separate genomic regions are under 
selection 

Quantitative trait loci results under field conditions may
vary between sites and with the genetic material used (Mercer
et al. 2006; Muraya et al. 2012). In our case, the crop cultivar,
as well as the wild parent, differed between the BC and
RIL crossing populations. Given this context, it is perhaps
surprising that we found several key genomic regions
affecting fitness traits in both crosses and environments,
alongside a number of substantial differences.

Both the BC and RIL populations had two genomic
regions, one co-localized and one specific for each cross,
with fitness QTL that were consistent across field sites.
Fitness distributions and the average rank of fitness QTL
genotypes confirmed that these genomic regions indeed
had a substantial impact on the fitness of BC and RIL
hybrid lineages. The majority of lines with the most
selectively advantageous fitness QTL genotype displayed
relatively high seed yields and, on average, these groups
showed the highest rank compared with other
combinations of parental alleles. This pattern with few
genomic regions of major impact is similar to QTL selection
patterns found in slender wild oat (Latta et al. 2010)
and in sunflower (Baack et al. 2008; Dechaine et al. 2009).

Seeds produced per seed sown QTL co-localized at 
the top of linkage group (LG) 7 for both BC and RILs. 
The selection differentials showed that the selective 
advantage was conferred by the wild allele, by favouring 
a higher SPSS, early flowering and higher survival rates. 
This QTL region is probably the result of the presence 
of a major gene for flowering, in which the crop allele 
confers a selective disadvantage by delaying bolting 
(Hartman et al. 2012). The second genomic region 
under selection was specific for each cross, with BC fit- 
ness QTL on the bottom of LG6 and RIL fitness QTL 
on the bottom of LG5. For BC QTL at LG6, it was 
again the wild allele that gave the selective advantage 
favouring earlier flowering, higher survival rates, and 
higher SPSS. These did not co-localize with any RIL 
QTL. In contrast, for the RIL QTL cluster of LG5, it 
was the crop allele that favoured SPSS, seed output 
and seeds per capitulum (Hartman et al. 2012). 

Genetic basis of better performing lines 

At both field sites and for BC, as well as RIL crossing popu- 
lations, there was a substantial number of hybrid lines that 
outperformed their respective wild parent, although 
hybrids on average produced fewer seeds per seed sown than
the wild parent, with the exception of BC hybrids on clay
soil that performed better than the wild parent (see below). 
This observed hybrid vigour concurs with the transgressive 
segregation observed in greenhouse experiments employing 
the same BC and RIL hybrid lineages, in which individual
lines had an increased vigour under drought, nutrient limi- 
tation and salt stress (Hartman 2012; Uwimana et al. 
2012b). 

Figure 3 Relationship between the amount of crop genome (%) and the average Seeds Produced per Seed Sown (square-root-transformed) for each
backcross (BC1) family and recombinant inbred line (RIL). (A, B) simple regression of fitness on crop genome %, and (C, D) residual regression after
the effects of the two major fitness quantitative trait loci were taken out, as covariates; Sites: Sijbekarspel (A and C) and Wageningen (B and D). Dots
indicate BC lines and triangles indicate RIL averages. Regression equations:
(A) BC1: y = 129.4 - 1.12x, P = 0.03, R² = 0.03; RIL: y = 58.9 - 0.25x, P = 0.62, R² = 0.01;
(B) BC1: y = 176.0 - 1.79x, P = 0.004, R² = 0.07; RIL: y = 93.6 - 0.50x, P = 0.45, R² = 0.004;
(C) BC1: y = -21.3 + 0.83x, P = 0.14, R² = 0.02; RIL: y = -4.83 + 0.09x, P = 0.84, R² = 0.01;
(D) BC1: y = -0.90 + 0.03x, P = 0.96, R² = 0.01; RIL: y = -0.87 + 0.02x, P = 0.98, R² = 0.01.

Heterosis, increased hybrid vigour in early-generation
hybrids (Rieseberg et al. 2000; Johansen-Morris and Latta
2006), probably explains, for the larger part, that all BC1S1
families produced at least some seeds, even though these
hybrids were backcrossed once to one of the parents. In
contrast, approximately 30% of RILs produced no seeds.
With each subsequent generation, heterozygosity
rapidly decreases in a selfing species. Hence, the lines of a
lettuce RIL population selfed for nine generations are virtually
entirely homozygous and heterosis effects cannot account
for the better performing lines in later generations (Burke
and Arnold 2001). However, the higher fitness of
early-generation lettuce hybrids may favour survival of hybrids
with novel genotypes, thereby increasing the chances for these
beneficial novel genotypes to be fixed in later generations
(Johansen-Morris and Latta 2006; Latta et al. 2007).
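
The loss of heterozygosity under selfing referred to above can be quantified with the standard expectation that each generation of selfing halves the proportion of heterozygous loci. The Python sketch below applies it to both populations. It assumes neutral loci, no selection, an F1 that is heterozygous at every locus differing between the parents, and that "nine generations" means nine successive rounds of selfing. The function name is illustrative.

```python
def heterozygosity_after_selfing(generations, initial=1.0):
    """Expected fraction of heterozygous loci after repeated selfing.

    Each generation of selfing halves the expected heterozygosity.
    """
    return initial * 0.5 ** generations

# RILs: start from a fully heterozygous F1, then nine rounds of selfing.
ril_het = heterozygosity_after_selfing(9)

# BC1 plants are expected to be heterozygous at half of the loci that
# differ between the parents; one selfing (BC1S1) halves that again.
bc1s1_het = heterozygosity_after_selfing(1, initial=0.5)

print(f"RILs after nine selfings: {ril_het:.4%} heterozygous")
print(f"BC1S1 families:           {bc1s1_het:.0%} heterozygous")
```

Under these assumptions roughly 0.2% of loci remain heterozygous in the RILs, against an expected 25% in BC1S1 families, which is why heterosis can contribute to the performance of the backcross families but not to that of the RILs.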

The steep decline in fitness of BC1 families with a higher
amount of crop genome indicates there might be strong
selection against, and hence a rapid elimination of, crop
genome in the first hybrid generations. This could be due
to hitchhiking effects, since in early-generation hybrids
many crop genes are in LD with genes under selection, as
indicated by the lower amount of crop genome of the most
advantageous BC1 fitness QTL genotype. In contrast, LD is
greatly reduced in 9th-generation RILs (Flint-Garcia et al.
2003; Stewart et al. 2003). Moreover, a positively selected
crop gene was also segregating in the RIL population. In
RILs, all genotypes have approximately the same amount of
crop genome. This suggests that in later generations
particular combinations of genes became important,
independent of linkage drag, giving rise
to transgressive segregation (Rieseberg et al. 1999, 2003).

Hartman et al. Selection patterns in lettuce crop-wild crosses. © 2013 The Authors. Published by Blackwell Publishing Ltd 6 (2013) 569-584

Quantitative trait loci studies have consistently pointed
at the additive effects of complementary genes of the two
parental species as the most likely underlying genetic basis
for transgressive segregation (Rieseberg et al. 1999, 2000;
Burke and Arnold 2001). After hybridization, QTL with
effects in opposing directions within each parent may
recombine in the hybrids, resulting in some lettuce hybrids
having a majority of QTL with positive effects leading to a
high fitness, or with negative effects leading to a low fitness
(Lynch and Walsh 1998; Rieseberg et al. 2007). Indeed, six
to seven (BC and RILs results respectively) of the ten traits
measured in this study show QTL with opposing effects,
where in some genomic locations the crop parental allele is
selectively advantageous and in other locations it is the wild
parental allele.
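
The recombination of complementary alleles described above can be illustrated with the two RIL fitness regions, where the crop allele was favourable on LG5 and the wild allele on LG7. The Python sketch below scores each genotype by its number of favourable alleles and sets this against the average ranks in Table 4. Giving both regions equal weight is an illustrative simplification, not an estimate of the QTL effects, and the names are hypothetical.

```python
# Favourable allele at each RIL fitness region for SPSS:
# crop (C) on LG5, wild (W) on LG7.
favourable = {"LG5": "C", "LG7": "W"}

# Average rank in (Sijbekarspel, Wageningen) from Table 4; lower = fitter.
average_rank = {
    "5C-7C": (52.9, 51.7),  # crop-type at both regions
    "5W-7W": (51.3, 53.1),  # wild-type at both regions
    "5C-7W": (27.6, 28.9),  # recombinant carrying both favourable alleles
    "5W-7C": (76.5, 73.1),  # recombinant carrying both unfavourable alleles
}

def favourable_allele_count(genotype):
    """Count regions carrying the favourable allele, e.g. '5C-7W' gives 2."""
    lg5, lg7 = genotype.split("-")
    return int(lg5[-1] == favourable["LG5"]) + int(lg7[-1] == favourable["LG7"])

scores = {g: favourable_allele_count(g) for g in average_rank}

# Genotypes ordered from most to fewest favourable alleles.
for genotype in sorted(average_rank, key=lambda g: -scores[g]):
    print(genotype, scores[genotype], average_rank[genotype])
```

The two parental-type genotypes each carry one favourable allele and rank near the middle at both sites, whereas the recombinant 5C-7W carries both and ranks best, and 5W-7C carries neither and ranks worst. This is the pattern expected when alleles with opposing effects in the parents are recombined.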

Heterosis, linkage and transgressive segregation are not 
the only genetic processes underlying hybrid fitness. For 
example, Uwimana et al. (2012b) found epistasis effects in 
BC1 and BC2 generation lettuce hybrids when subjecting
these to several stress treatments in greenhouse conditions. 
In later generations, these epistasis effects are more likely to 
contribute to the breakdown of co-adapted gene complexes 
(Rieseberg et al. 2000; Burke and Arnold 2001) and there- 
fore lower hybrid fitness. This may also partly explain the 
30% of RILs without any seed output. 

Our results are based on two L. serriola genotypes, a 
European and an American accession. Genetic diversity in 
L. serriola is considerable (Van de Wiel et al. 2010), so it 
would be desirable to study more wild genotypes, for 
instance, as diallel combinations with crop varieties in 
future studies. 

Higher chance of introgression in novel habitats 

Fitness distributions differed between the two habitats
used, indicating that introgression of crop alleles through 
hybridization might be more likely to occur in novel habi- 
tats, as opposed to the natural wild habitat of the wild par- 
ent. More hybrid lineages performed better than L. serriola 
in the novel clay soil habitat than in the original sandy soil 
habitat (habitat requirement as described in Hooftman 
et al. (2006)), especially BC hybrid lineages. In spite of the 
fact that the selective advantage for the two BC fitness QTL 
was conferred by the wild allele, 79% of families performed 
better than the wild parent (L. serriola Eys) in clay soil, 
whereas only 5% of BC1S1 families performed better in
sandy soil. The lower performance of the wild parent in the 
clay site was caused by a lower survival until reproduction, 
as well as a lower than average seed yield of reproducing 
plants. In addition, the PVE by fitness QTL (in total 36.9% 
in clay soil and 26.9% in sandy soil) indicates that not all 
fitness variation was explained by these fitness QTL and 
that apparently the increased fitness of BC1S1 hybrids in
clay soil could be due to their mixed crop-wild genomic
background and heterosis effects. 



It should be noted that our experiments included one location
of each habitat type, albeit with large differences in
conditions and replicated plots, but experiments with mul- 
tiple sites for each habitat are needed to see if crop-wild 
hybrid individuals indeed perform better in novel habitats 
compared with the natural wild habitat. This pattern has 
been found in other species. In slender wild oat, more 
hybrid genotypes were able to outperform the parental 
lines in a greenhouse environment, representing a novel 
habitat, than in the original wild habitat (Johansen-Morris 
and Latta 2008). Similarly, radish crop-wild hybrids exhib- 
ited a higher survival rate and produced more seeds per 
plant relative to the wild parent in a new environment, 
whereas they had comparable survival rates but produced 
fewer seeds in the original habitat (Campbell et al. 2006). 
Our results also concur with those found by Hooftman
et al. (2005, 2007, 2009), in crossings of the same parents 
as the BC lines of the current study. They found a strong 
heterosis effect in the clay soil averaging over all lines, but 
also a clear hybrid vigour breakdown over multiple genera- 
tions potentially through further segregation or epistasis 
effects. 

Implications for crop breeding and risk assessment 

The genetic processes underlying hybrid fitness have 
important consequences for the chances of crop (trans) 
gene transfer to wild populations and, therefore, for the 
methods of ERA. Many studies on crop-wild hybrid fitness 
use the average fitness of hybrid classes (Halfhill et al. 
2005; Hooftman et al. 2005; Mercer et al. 2006; Campbell 
and Snow 2007; Huangfu et al. 2011); in case hybrid fitness 
is low compared with the wild parent this is taken to sug- 
gest that chances for crop allele transfer are low as well. 
However, our results and those of others indicate that par- 
ticular hybrid genotypes may outperform the parental lines 
under certain environmental conditions (Burke and Arnold 
2001; Johansen-Morris and Latta 2008; Hooftman et al. 
2009). Furthermore, the high and significant selection dif- 
ferentials for fitness traits (including flowering date) and 
the broad-sense heritability values suggest that selection in 
crop-wild hybrid populations can be a dynamic and rapid 
process. Also, although it appears that a larger amount of 
crop genome decreased hybrid fitness, there was considerable
spread in fitness among hybrid lines with similar crop-wild
genomic ratio. Therefore, even if hybrids on average
have a lower fitness, particular hybrid lines with a large 
amount of crop genome may exist that have a higher fit- 
ness. Thus, a lower average fitness of hybrids does not pre- 
clude gene transfer between crops and their wild relatives. 

In addition, we have found that results can be cultivar-specific,
that is, the fitness of hybrids depends on the specific
combination of crop and wild parent and hence,
fitness studies for risk assessment should include a range of
wild parents (Muraya et al. 2012). Similarly, selection pres- 
sures differ across time and place, so ideally risk assessment 
should be performed at several locations and in multiple 
years (Hails and Morley 2005). ERA including hybrids of 
several parental lines, locations and years involves field 
experiments with a huge amount of time and labour. How- 
ever, measuring life history traits can already lead to robust 
conclusions, because through QTL analysis most genomic 
selection patterns can be identified (Hartman et al. 2012). 

Conclusion and way forward 

Our results show that there is a high likelihood in lettuce 
for novel crop-wild hybrids to arise that have a higher fit- 
ness than the wild parent through combinations of hetero- 
sis, linkage and transgressive segregation. This may be 
more likely to occur in novel habitats (Barton 2001). Con- 
sequently, this provides an avenue for introgression of crop 
alleles into the wild population. We did identify a genomic 
region on LG7 where the crop allele induced delayed flow- 
ering that was under negative selection. In this region, 
effects were stable across cultivars and the environments of 
our field experiments and it could therefore be used in 
transgene mitigation strategies. In such a strategy, the 
transgene is closely linked to a region or gene with a strong 
negative selection effect in the habitat of the wild type 
(Gressel 1999; Stewart et al. 2003). 
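
The logic of such a mitigation strategy can be sketched with a simple selection recursion: a crop allele that lowers relative fitness to 1 - s declines in frequency each generation, and a transgene that stays linked to it declines with it. The Python sketch below is a deliberately minimal illustration. The selection coefficient `s` and starting frequency `p0` are hypothetical values, not estimates from this study, and the model assumes selection among homozygous, selfing lineages with no recombination between the transgene and the selected region.

```python
def crop_allele_trajectory(p0, s, generations):
    """Frequency of a crop allele with relative fitness 1 - s over time.

    Assumes selection among homozygous (selfing) lineages and complete
    linkage between the transgene and the selected region.
    """
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        # Mean fitness is 1 - s * p; carriers contribute p * (1 - s).
        freqs.append(p * (1 - s) / (1 - s * p))
    return freqs

# Hypothetical values chosen only for illustration.
trajectory = crop_allele_trajectory(p0=0.5, s=0.3, generations=10)

for generation, freq in enumerate(trajectory):
    print(f"generation {generation}: crop allele frequency {freq:.3f}")
```

In this sketch the frequency falls monotonically for any s between 0 and 1. Recombination between the transgene and the selected region, which the sketch ignores, would weaken the effect and is one reason close linkage is required.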

This study is only a first step to identify the specific genes 
involved, and further work including the creation of Near 
Isogenic Lines (NILs) is being planned. Whether the detri- 
mental effect of delayed flowering is strong enough to pre- 
vent crop (trans)gene escape will be explored further in 
simulation models using these empirical field data. 

Data archiving 

The data used in this article are available as Supporting 
information. 